Overview

Dataset Statistics

Number of Variables 25
Number of Rows 4943
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 1.5 MB
Average Row Size in Memory 309.0 B
Variable Types
  • Numerical: 17
  • Categorical: 8

Dataset Insights

Similar Distribution: zestimateLowPercent and zestimateHighPercent have similar distributions
Similar Distribution: zestimate and price have similar distributions
Skewed: longitude is skewed
Skewed: countyFIPS is skewed
Skewed: monthlyHoaFee is skewed
Skewed: annualHomeownersInsurance is skewed
Skewed: yearBuilt is skewed
Skewed: latitude is skewed
Skewed: rentZestimate is skewed
Skewed: zestimateLowPercent is skewed
Skewed: timeOnZillow is skewed
Skewed: zestimate is skewed
Skewed: livingArea is skewed
Skewed: zipcode is skewed
Skewed: propertyTaxRate is skewed
Skewed: bathrooms is skewed
Skewed: bedrooms is skewed
Skewed: price is skewed
Skewed: zestimateHighPercent is skewed
High Cardinality: city has a high cardinality (64 distinct values)
Constant Length: state has constant length 2
Constant Length: homeType_APARTMENT has constant length 1
Constant Length: homeType_CONDO has constant length 1
Constant Length: homeType_LOT has constant length 1
Constant Length: homeType_MANUFACTURED has constant length 1
Constant Length: homeType_MULTI_FAMILY has constant length 1
Constant Length: homeType_SINGLE_FAMILY has constant length 1
Negatives: longitude has 4943 (100.0%) negatives
Zeros: monthlyHoaFee has 3954 (79.99%) zeros
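
The skew insights listed above correspond to the skewness coefficient γ1 reported for each variable in the Variables section. The report does not say which estimator it uses, so the following is only a minimal sketch assuming the population (biased) moment definition; the `skewness` helper and the toy data are hypothetical and not taken from the dataset.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (assumed definition, see lead-in)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Toy data (hypothetical): one large value creates a long right tail.
right_tailed = [1, 2, 3, 4, 100]
left_tailed = [-x for x in right_tailed]  # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed) > 0)  # positive: "skewed right"
print(skewness(left_tailed) < 0)   # negative: "skewed left"
print(skewness(symmetric))         # zero for symmetric data
```

A positive γ1 (for example longitude) indicates a long right tail, and a negative γ1 (for example latitude) indicates a long left tail.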

Variables


longitude

numerical

Approximate Distinct Count 3695
Approximate Unique (%) 74.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean -148.979
Minimum -150.0109
Maximum -70.4831
Zeros 0
Zeros (%) 0.0%
Negatives 4943
Negatives (%) 100.0%
  • longitude is skewed right (γ1 = 8.691)

Quantile Statistics

Minimum -150.0109
5-th Percentile -149.9591
Q1 -149.9289
Median -149.8728
Q3 -149.8133
95-th Percentile -149.7279
Maximum -70.4831
Range 79.5278
IQR 0.1156

Descriptive Statistics

Mean -148.979
Standard Deviation 7.6135
Variance 57.9651
Sum -736403.0772
Skewness 8.691
Kurtosis 75.0826
Coefficient of Variation -0.0511
  • longitude is not normally distributed (p-value 4.229023872783742e-25)
  • longitude has 69 outliers
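
Several of the longitude figures above are derived from others, and the arithmetic can be checked directly. The sketch below uses only values copied from this section and assumes the usual definitions (range = maximum − minimum, IQR = Q3 − Q1, coefficient of variation = standard deviation / mean); the variable names are illustrative.

```python
# Values copied from the longitude section of the report.
minimum, maximum = -150.0109, -70.4831
q1, q3 = -149.9289, -149.8133
mean, std = -148.979, 7.6135

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of rows
cv = std / mean                  # coefficient of variation

print(round(value_range, 4))  # 79.5278
print(round(iqr, 4))          # 0.1156
print(round(cv, 4))           # -0.0511
```

The coefficient of variation is negative only because the mean is negative. The range is several hundred times the IQR, which fits the small number of rows far from the bulk of the data.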

countyFIPS

numerical

Approximate Distinct Count 44
Approximate Unique (%) 0.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 2315.6472
Minimum 1117
Maximum 55079
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • countyFIPS is skewed right (γ1 = 12.0282)

Quantile Statistics

Minimum 1117
5-th Percentile 2020
Q1 2020
Median 2020
Q3 2020
95-th Percentile 2020
Maximum 55079
Range 53962
IQR 0

Descriptive Statistics

Mean 2315.6472
Standard Deviation 3049.2515
Variance 9.2979e+06
Sum 1.1446e+07
Skewness 12.0282
Kurtosis 156.0341
Coefficient of Variation 1.3168
  • countyFIPS is not normally distributed (p-value 4.229073161270118e-25)
  • countyFIPS has 68 outliers

monthlyHoaFee

numerical

Approximate Distinct Count 213
Approximate Unique (%) 4.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 57.6729
Minimum 0
Maximum 45929
Zeros 3954
Zeros (%) 80.0%
Negatives 0
Negatives (%) 0.0%
  • monthlyHoaFee is skewed right (γ1 = 66.5721)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 348
Maximum 45929
Range 45929
IQR 0

Descriptive Statistics

Mean 57.6729
Standard Deviation 664.5638
Variance 441644.985
Sum 285077
Skewness 66.5721
Kurtosis 4591.1575
Coefficient of Variation 11.523
  • monthlyHoaFee is not normally distributed (p-value 4.226866352157144e-25)
  • monthlyHoaFee has 989 outliers
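
The outlier count above equals the number of non-zero rows (4943 − 3954 = 989). That is what a 1.5 × IQR fence would flag when Q1 = Q3 = 0, although the report does not state which outlier rule it applies, so treat the rule as an assumption. The `tukey_fences` helper below is hypothetical.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences under the assumed k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Values copied from the monthlyHoaFee section and the dataset overview.
rows, zeros = 4943, 3954
q1, q3 = 0, 0

lower, upper = tukey_fences(q1, q3)
non_zero_rows = rows - zeros     # the minimum is 0, so every non-zero fee is above `upper`
zero_share = 100 * zeros / rows  # percentage of rows with no HOA fee

print(lower, upper)          # 0.0 0.0
print(non_zero_rows)         # 989
print(round(zero_share, 2))  # 79.99
print(round(zero_share, 1))  # 80.0
```

The last two lines also show why the overview reports 79.99% zeros while this section reports 80.0%: the same ratio is rounded to a different number of decimals.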

annualHomeownersInsurance

numerical

Approximate Distinct Count 2074
Approximate Unique (%) 42.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 1681.3073
Minimum 5
Maximum 11550
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • annualHomeownersInsurance is skewed right (γ1 = 2.1053)

Quantile Statistics

Minimum 5
5-th Percentile 660.1
Q1 1245
Median 1630
Q3 1972
95-th Percentile 2894.4
Maximum 11550
Range 11545
IQR 727

Descriptive Statistics

Mean 1681.3073
Standard Deviation 740.1802
Variance 547866.7354
Sum 8.3107e+06
Skewness 2.1053
Kurtosis 13.6122
Coefficient of Variation 0.4402
  • annualHomeownersInsurance is not normally distributed (p-value 5.805274740062222e-10)
  • annualHomeownersInsurance has 197 outliers

state

categorical

Approximate Distinct Count 27
Approximate Unique (%) 0.5%
Missing 0
Missing (%) 0.0%
Memory Size 331181
  • The most frequent value (AK) occurs over 609.38 times as often as the second most frequent value (FL)

Length

Mean 2
Standard Deviation 0
Median 2
Minimum 2
Maximum 2

Sample

1st row AK
2nd row AK
3rd row AK
4th row AK
5th row AK

Letter

Count 9886
Lowercase Letter 0
Space Separator 0
Uppercase Letter 9886
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (AK, FL) account for over 50.0% of rows
  • The most frequent word (ak) occurs over 609.38 times as often as the second most frequent word (fl)
  • state has words of constant length
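
The dominance ratio quoted in this section compares the counts of the two most frequent values. The report does not list the underlying counts, so the sketch below uses hypothetical toy data; `top_two_ratio` is an illustrative helper, not part of any profiling library.

```python
from collections import Counter


def top_two_ratio(values):
    """Count of the most frequent value divided by the count of the runner-up.

    Requires at least two distinct values.
    """
    (_, first), (_, second) = Counter(values).most_common(2)
    return first / second


# Toy column (hypothetical counts, not the dataset's).
sample = ["AK"] * 12 + ["FL"] * 3 + ["TX"]
print(top_two_ratio(sample))  # 4.0
```

A very large ratio, as reported for state, means the column is close to constant, so it carries little information for most rows.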

yearBuilt

numerical

Approximate Distinct Count 91
Approximate Unique (%) 1.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 1976.1088
Minimum 1880
Maximum 2022
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • yearBuilt is skewed left (γ1 = -0.1903)

Quantile Statistics

Minimum 1880
5-th Percentile 1952
Q1 1969
Median 1977
Q3 1983
95-th Percentile 1998
Maximum 2022
Range 142
IQR 14

Descriptive Statistics

Mean 1976.1088
Standard Deviation 13.1658
Variance 173.3382
Sum 9.7679e+06
Skewness -0.1903
Kurtosis 1.6895
Coefficient of Variation 0.006662
  • yearBuilt is not normally distributed (p-value 6.059408007527494e-09)
  • yearBuilt has 135 outliers

latitude

numerical

Approximate Distinct Count 4014
Approximate Unique (%) 81.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 60.8252
Minimum 26.0047
Maximum 61.2312
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • latitude is skewed left (γ1 = -8.8491)

Quantile Statistics

Minimum 26.0047
5-th Percentile 61.1213
Q1 61.1462
Median 61.172
Q3 61.1974
95-th Percentile 61.2192
Maximum 61.2312
Range 35.2265
IQR 0.05112

Descriptive Statistics

Mean 60.8252
Standard Deviation 2.9798
Variance 8.879
Sum 300659.1763
Skewness -8.8491
Kurtosis 79.5808
Coefficient of Variation 0.04899
  • latitude is not normally distributed (p-value 4.22880277108126e-25)
  • latitude has 69 outliers

rentZestimate

numerical

Approximate Distinct Count 2176
Approximate Unique (%) 44.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 2647.388
Minimum 782
Maximum 11544
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • rentZestimate is skewed right (γ1 = 1.1522)

Quantile Statistics

Minimum 782
5-th Percentile 1593.1
Q1 2176.5
Median 2607
Q3 3015
95-th Percentile 3931.8
Maximum 11544
Range 10762
IQR 838.5

Descriptive Statistics

Mean 2647.388
Standard Deviation 733.2617
Variance 537672.7183
Sum 1.3086e+07
Skewness 1.1522
Kurtosis 5.963
Coefficient of Variation 0.277
  • rentZestimate is not normally distributed (p-value 2.137504703390668e-07)
  • rentZestimate has 149 outliers

city

categorical

Approximate Distinct Count 64
Approximate Unique (%) 1.3%
Missing 0
Missing (%) 0.0%
Memory Size 365802
  • The most frequent value (Anchorage) occurs over 1625.0 times as often as the second most frequent value (New Seabury)

Length

Mean 9.004
Standard Deviation 0.3256
Median 9
Minimum 4
Maximum 17

Sample

1st row Anchorage
2nd row Anchorage
3rd row Anchorage
4th row Anchorage
5th row Anchorage

Letter

Count 44485
Lowercase Letter 39520
Space Separator 22
Uppercase Letter 4965
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Anchorage, New Seabury) account for over 50.0% of rows
  • The most frequent word (anchorage) occurs over 1218.75 times as often as the second most frequent word (new)

zestimateLowPercent

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 9.7141
Minimum 5
Maximum 38
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • zestimateLowPercent is skewed right (γ1 = 1.8221)

Quantile Statistics

Minimum 5
5-th Percentile 6
Q1 8
Median 9
Q3 11
95-th Percentile 14
Maximum 38
Range 33
IQR 3

Descriptive Statistics

Mean 9.7141
Standard Deviation 2.4554
Variance 6.029
Sum 48017
Skewness 1.8221
Kurtosis 9.1871
Coefficient of Variation 0.2528
  • zestimateLowPercent is not normally distributed (p-value 1.2433565227625064e-10)
  • zestimateLowPercent has 110 outliers

timeOnZillow

numerical

Approximate Distinct Count 2481
Approximate Unique (%) 50.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 3777.7107
Minimum 1
Maximum 19949
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • timeOnZillow is skewed right (γ1 = 1.2213)

Quantile Statistics

Minimum 1
5-th Percentile 1335.1
Q1 3160
Median 3779
Q3 4394.5
95-th Percentile 6276.9
Maximum 19949
Range 19948
IQR 1234.5

Descriptive Statistics

Mean 3777.7107
Standard Deviation 1563.5603
Variance 2.4447e+06
Sum 1.8673e+07
Skewness 1.2213
Kurtosis 8.6458
Coefficient of Variation 0.4139
  • timeOnZillow is not normally distributed (p-value 2.1566418561312374e-12)
  • timeOnZillow has 492 outliers

zestimate

numerical

Approximate Distinct Count 3198
Approximate Unique (%) 64.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 400226.7449
Minimum 65000
Maximum 2.7518e+06
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • zestimate is skewed right (γ1 = 2.1113)

Quantile Statistics

Minimum 65000
5-th Percentile 157340
Q1 296400
Median 388100
Q3 469550
95-th Percentile 689240
Maximum 2.7518e+06
Range 2.6868e+06
IQR 173150

Descriptive Statistics

Mean 400226.7449
Standard Deviation 176022.8624
Variance 3.0984e+10
Sum 1.9783e+09
Skewness 2.1113
Kurtosis 13.6903
Coefficient of Variation 0.4398
  • zestimate is not normally distributed (p-value 9.067724541471746e-10)
  • zestimate has 195 outliers

livingArea

numerical

Approximate Distinct Count 1860
Approximate Unique (%) 37.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 1812.9887
Minimum 1
Maximum 14500
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • livingArea is skewed right (γ1 = 2.3735)

Quantile Statistics

Minimum 1
5-th Percentile 744
Q1 1171.5
Median 1716
Q3 2147.5
95-th Percentile 3511.8
Maximum 14500
Range 14499
IQR 976

Descriptive Statistics

Mean 1812.9887
Standard Deviation 907.7119
Variance 823940.9724
Sum 8.9616e+06
Skewness 2.3735
Kurtosis 16.435
Coefficient of Variation 0.5007
  • livingArea is not normally distributed (p-value 4.767691142561027e-09)
  • livingArea has 226 outliers

zipcode

numerical

Approximate Distinct Count 75
Approximate Unique (%) 1.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 98716.4337
Minimum 2649
Maximum 99518
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • zipcode is skewed left (γ1 = -10.1814)

Quantile Statistics

Minimum 2649
5-th Percentile 99501
Q1 99502
Median 99507
Q3 99515
95-th Percentile 99518
Maximum 99518
Range 96869
IQR 13

Descriptive Statistics

Mean 98716.4337
Standard Deviation 7461.2715
Variance 5.5671e+07
Sum 4.8796e+08
Skewness -10.1814
Kurtosis 106.9219
Coefficient of Variation 0.07558
  • zipcode is not normally distributed (p-value 4.227835871687671e-25)
  • zipcode has 68 outliers

propertyTaxRate

numerical

Approximate Distinct Count 56
Approximate Unique (%) 1.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 1.307
Minimum 0
Maximum 2.43
Zeros 2
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • propertyTaxRate is skewed left (γ1 = -5.6097)

Quantile Statistics

Minimum 0
5-th Percentile 1.31
Q1 1.31
Median 1.31
Q3 1.31
95-th Percentile 1.31
Maximum 2.43
Range 2.43
IQR 0

Descriptive Statistics

Mean 1.307
Standard Deviation 0.07011
Variance 0.004916
Sum 6460.68
Skewness -5.6097
Kurtosis 157.2347
Coefficient of Variation 0.05364
  • propertyTaxRate is not normally distributed (p-value 4.227591975513623e-25)
  • propertyTaxRate has 69 outliers

bathrooms

numerical

Approximate Distinct Count 25
Approximate Unique (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 2.0984
Minimum 0
Maximum 30
Zeros 24
Zeros (%) 0.5%
Negatives 0
Negatives (%) 0.0%
  • bathrooms is skewed right (γ1 = 6.8593)

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 1.5
Median 2
Q3 2.5
95-th Percentile 3.5
Maximum 30
Range 30
IQR 1

Descriptive Statistics

Mean 2.0984
Standard Deviation 0.9792
Variance 0.9588
Sum 10372.6
Skewness 6.8593
Kurtosis 162.5769
Coefficient of Variation 0.4666
  • bathrooms is not normally distributed (p-value 1.3805558743875348e-17)
  • bathrooms has 83 outliers

bedrooms

numerical

Approximate Distinct Count 14
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 3.2055
Minimum 0
Maximum 30
Zeros 8
Zeros (%) 0.2%
Negatives 0
Negatives (%) 0.0%
  • bedrooms is skewed right (γ1 = 3.5487)

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 2
Median 3
Q3 4
95-th Percentile 5
Maximum 30
Range 30
IQR 2

Descriptive Statistics

Mean 3.2055
Standard Deviation 1.2508
Variance 1.5644
Sum 15845
Skewness 3.5487
Kurtosis 54.068
Coefficient of Variation 0.3902
  • bedrooms is not normally distributed (p-value 1.2675151491265345e-17)
  • bedrooms has 33 outliers

price

numerical

Approximate Distinct Count 3196
Approximate Unique (%) 64.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 400306.7736
Minimum 1250
Maximum 2750000
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • price is skewed right (γ1 = 2.1053)

Quantile Statistics

Minimum 1250
5-th Percentile 157120
Q1 296400
Median 388200
Q3 469550
95-th Percentile 689240
Maximum 2750000
Range 2748750
IQR 173150

Descriptive Statistics

Mean 400306.7736
Standard Deviation 176233.1449
Variance 3.1058e+10
Sum 1.9787e+09
Skewness 2.1053
Kurtosis 13.6119
Coefficient of Variation 0.4402
  • price is not normally distributed (p-value 5.829618976649828e-10)
  • price has 197 outliers

zestimateHighPercent

numerical

Approximate Distinct Count 30
Approximate Unique (%) 0.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 79088
Mean 9.7858
Minimum 5
Maximum 59
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • zestimateHighPercent is skewed right (γ1 = 3.1706)

Quantile Statistics

Minimum 5
5-th Percentile 7
Q1 8
Median 9
Q3 11
95-th Percentile 14
Maximum 59
Range 54
IQR 3

Descriptive Statistics

Mean 9.7858
Standard Deviation 2.8009
Variance 7.845
Sum 48371
Skewness 3.1706
Kurtosis 29.4945
Coefficient of Variation 0.2862
  • zestimateHighPercent is not normally distributed (p-value 9.996546298460288e-11)
  • zestimateHighPercent has 152 outliers

homeType_APARTMENT

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 326238
  • The most frequent value (0) occurs over 111.34 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 4943
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent word (0) occurs over 111.34 times as often as the second most frequent word (1)
  • homeType_APARTMENT has words of constant length

homeType_CONDO

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 326238
  • The most frequent value (0) occurs over 5.58 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 1

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 4943
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent word (0) occurs over 5.58 times as often as the second most frequent word (1)
  • homeType_CONDO has words of constant length

homeType_LOT

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 326238
  • The most frequent value (0) occurs over 705.14 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 4943
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent word (0) occurs over 705.14 times as often as the second most frequent word (1)
  • homeType_LOT has words of constant length

homeType_MANUFACTURED

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 326238
  • The most frequent value (0) occurs over 448.36 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 4943
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent word (0) occurs over 448.36 times as often as the second most frequent word (1)
  • homeType_MANUFACTURED has words of constant length

homeType_MULTI_FAMILY

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 326238
  • The most frequent value (0) occurs over 11.03 times as often as the second most frequent value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 4943
  • The top 2 categories (0, 1) account for over 50.0% of rows
  • The most frequent word (0) occurs over 11.03 times as often as the second most frequent word (1)
  • homeType_MULTI_FAMILY has words of constant length

homeType_SINGLE_FAMILY

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 326238
  • The most frequent value (1) occurs over 2.72 times as often as the second most frequent value (0)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 4943
  • The top 2 categories (1, 0) account for over 50.0% of rows
  • The most frequent word (1) occurs over 2.72 times as often as the second most frequent word (0)
  • homeType_SINGLE_FAMILY has words of constant length
